-
Notifications
You must be signed in to change notification settings - Fork 2.3k
Add skiplist optimization to auto_date_histogram aggregation #20057
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add skiplist optimization to auto_date_histogram aggregation #20057
Conversation
|
❌ Gradle check result for 27f8248: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
|
Functionally correct but not showing improvement QueryResult |
|
@asimmahmood1 - Have you verified that the regression is only with |
|
So I figured out why performance was not up to par. Short answer is when auto date moves onto large intervals, we need to track of not just preparedRounding but also bucketOrds. If preparedRounding has changed since last time, we need to restart the skiplist logic. Otherwise, we'll collect too many docs, and although the end result doesn't change (i.e. unit test passes), performance is too low. Auto date histo has two modes: FromSingle and FromMany. FromSingle is often used in top level aggregation, so It's similar to Date Histogram where parent is null. FromMany is used e.g in big5's See validation below. Note: will update this PR after #19573 is merged. |
range auto date: from 1335 to 139 (89%) |
range-auto-date-with-metrics (22% lower)this is similar to the date-with-metrics, since the time is bounded by tavg stat |
|
❌ Gradle check result for 4d7f22d: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
|
{"run-benchmark-test": "id_3"} |
|
{"run-benchmark-test": "id_11"} |
|
❌ Gradle check result for 4d7f22d: Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
|
The Jenkins job url is https://build.ci.opensearch.org/job/benchmark-pull-request/5174/ . Final results will be published once the job is completed. |
Benchmark ResultsBenchmark Results for Job: https://build.ci.opensearch.org/job/benchmark-pull-request/5174/
|
Benchmark Baseline Comparison ResultsBenchmark Results for Job: https://build.ci.opensearch.org/job/benchmark-compare/210/
|
Signed-off-by: Asim Mahmood <asim.seng@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
server/src/test/java/org/opensearch/search/aggregations/bucket/histogram/AutoDateHistogramAggregatorTests.java (2)
1052-1052: Unused field may be orphaned code.The
filterFieldis added to every document but is never used in any query or assertion in the tests. Consider either:
- Removing it if it's not needed
- Adding a comment explaining its intended future use
1057-1063: Consider expanding the comment for clarity.The comment briefly states "delete a doc to avoid approx optimization" but doesn't explain the mechanism. A slightly more descriptive comment would help future maintainers understand why deleted documents prevent the approximation path.
- // delete a doc to avoid approx optimization + // Add and delete a document to create deleted docs in the segment. + // This prevents the filter rewrite approximation optimization from being used, + // ensuring the skiplist collector path is exercised during collection. if (enableSkiplist) {
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
CHANGELOG.md(1 hunks)server/src/test/java/org/opensearch/search/aggregations/bucket/histogram/AutoDateHistogramAggregatorTests.java(5 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- CHANGELOG.md
🧰 Additional context used
🧬 Code graph analysis (1)
server/src/test/java/org/opensearch/search/aggregations/bucket/histogram/AutoDateHistogramAggregatorTests.java (1)
server/src/main/java/org/opensearch/index/mapper/NumberFieldMapper.java (1)
NumberFieldMapper(102-2185)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (21)
- GitHub Check: gradle-check
- GitHub Check: precommit (21, ubuntu-24.04-arm)
- GitHub Check: Analyze (java)
- GitHub Check: precommit (25, ubuntu-latest)
- GitHub Check: precommit (25, ubuntu-24.04-arm)
- GitHub Check: precommit (21, windows-latest)
- GitHub Check: precommit (21, windows-2025, true)
- GitHub Check: precommit (21, ubuntu-latest)
- GitHub Check: precommit (25, macos-15)
- GitHub Check: precommit (21, macos-15-intel)
- GitHub Check: precommit (25, windows-latest)
- GitHub Check: precommit (25, macos-15-intel)
- GitHub Check: precommit (21, macos-15)
- GitHub Check: assemble (25, ubuntu-latest)
- GitHub Check: assemble (21, ubuntu-latest)
- GitHub Check: assemble (25, windows-latest)
- GitHub Check: assemble (21, ubuntu-24.04-arm)
- GitHub Check: assemble (25, ubuntu-24.04-arm)
- GitHub Check: detect-breaking-change
- GitHub Check: assemble (21, windows-latest)
- GitHub Check: Mend Security Check
🔇 Additional comments (8)
server/src/test/java/org/opensearch/search/aggregations/bucket/histogram/AutoDateHistogramAggregatorTests.java (8)
36-51: LGTM!Imports are properly added and all are used in the new test infrastructure.
102-103: LGTM!The comment addresses the previously requested explanation for using
@timestamp. This aligns with OpenSearch conventions where@timestampis commonly used for time-series data.
994-1003: LGTM!Clean backward-compatible overload pattern that maintains default behavior while allowing tests to opt into the skiplist path.
1094-1130: LGTM!Well-structured test that creates a dataset spanning both hourly and monthly granularities to exercise the rounding change behavior. The assertions properly verify histogram correctness, bucket count limits, document count totals, and temporal ordering.
1137-1178: LGTM!Good test coverage for the interaction between the skiplist collector and sub-aggregations. Verifies that stats sub-aggregations are correctly populated even when rounding changes occur during collection.
1185-1225: LGTM!Comprehensive test for bucket merging behavior. The dataset design (hourly data + yearly quarterly data) forces significant rounding changes, and the hardcoded expected values correctly account for how the 140 documents should be distributed across yearly buckets.
1231-1284: LGTM!Excellent addition that addresses the previously requested equivalence test. The randomized approach with 20-200 documents and random date distribution provides good coverage. The comparison of bucket counts, keys, and doc counts ensures the skiplist path produces identical results to the non-skiplist path.
1286-1307: LGTM!Clean helper method that encapsulates the aggregation execution logic, avoiding code duplication in
testSkiplistEquivalence.
Signed-off-by: Asim Mahmood <asim.seng@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
server/src/test/java/org/opensearch/search/aggregations/bucket/histogram/AutoDateHistogramAggregatorTests.java (2)
102-103: Consider expanding the comment to explain why@timestampis used.Per past review feedback, the comment could clarify that
@timestampis conventionally configured withskip_list: truein typical log/metrics index mappings, making it a suitable choice for testing skiplist behavior.- // @timestamp field name by default uses skip_list + // @timestamp field name is used because it conventionally has skip_list enabled + // in typical log/metrics index mappings, making it suitable for skiplist tests private static final String DATE_FIELD = "@timestamp";
1056-1062: Clarify the optimization being avoided and consider usingTermdirectly.The comment could be more specific about which approximation optimization is being avoided. Additionally, using
Termdirectly is more idiomatic for document deletion.- // delete a doc to avoid approx optimization + // Add and delete a document to ensure segments have deletions, + // which disables the count approximation optimization and forces + // the skiplist collector path to be exercised if (enableSkiplist) { document.add(new StringField("someField", "a", Field.Store.NO)); indexWriter.addDocument(document); indexWriter.commit(); - indexWriter.deleteDocuments(new TermQuery(new Term("someField", "a"))); + indexWriter.deleteDocuments(new Term("someField", "a")); }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
server/src/test/java/org/opensearch/search/aggregations/bucket/histogram/AutoDateHistogramAggregatorTests.java(5 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
server/src/test/java/org/opensearch/search/aggregations/bucket/histogram/AutoDateHistogramAggregatorTests.java (2)
plugins/arrow-flight-rpc/src/main/java/org/opensearch/arrow/flight/stats/FlightMetrics.java (1)
Histogram(409-449)server/src/main/java/org/opensearch/index/mapper/NumberFieldMapper.java (1)
NumberFieldMapper(102-2185)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
- GitHub Check: gradle-check
- GitHub Check: assemble (25, windows-latest)
- GitHub Check: detect-breaking-change
- GitHub Check: Analyze (java)
- GitHub Check: assemble (21, ubuntu-24.04-arm)
- GitHub Check: assemble (25, ubuntu-24.04-arm)
- GitHub Check: assemble (25, ubuntu-latest)
- GitHub Check: assemble (21, ubuntu-latest)
- GitHub Check: assemble (21, windows-latest)
- GitHub Check: precommit (21, windows-2025, true)
- GitHub Check: precommit (21, ubuntu-latest)
- GitHub Check: precommit (25, ubuntu-24.04-arm)
- GitHub Check: precommit (21, macos-15-intel)
- GitHub Check: precommit (21, macos-15)
- GitHub Check: precommit (21, windows-latest)
- GitHub Check: precommit (25, ubuntu-latest)
- GitHub Check: precommit (25, macos-15-intel)
- GitHub Check: precommit (25, windows-latest)
- GitHub Check: precommit (25, macos-15)
- GitHub Check: precommit (21, ubuntu-24.04-arm)
🔇 Additional comments (5)
server/src/test/java/org/opensearch/search/aggregations/bucket/histogram/AutoDateHistogramAggregatorTests.java (5)
1093-1129: LGTM!The test correctly validates skiplist collector behavior with rounding changes. The dataset construction (24 hourly + 30 spanning months) effectively triggers rounding changes, and the assertions verify the expected outcomes.
1136-1177: LGTM!Good coverage for verifying sub-aggregations work correctly when skiplist collector handles rounding changes. The test properly validates that stats sub-aggregations are present and have valid values for non-empty buckets.
1184-1224: LGTM!Excellent test for bucket merging behavior. The expected counts are correctly calculated:
- 2020: 120 hourly docs (5 days × 24 hours) + 4 quarterly docs = 124
- 2021-2024: 4 quarterly docs each
The explicit
expectedDocCountmap provides strong validation of correct bucket merging after rounding increases.
1230-1283: LGTM - addresses past review feedback!This test directly addresses the reviewer request for comparing skiplist vs non-skiplist results on random data. The randomized approach with sorted dataset ensures thorough coverage while maintaining deterministic comparison between the two paths.
1285-1306: LGTM!Clean helper method that properly manages resources and follows the established patterns in the test class.
|
❌ Gradle check result for 350729d: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
...in/java/org/opensearch/search/aggregations/bucket/histogram/AutoDateHistogramAggregator.java
Outdated
Show resolved
Hide resolved
Signed-off-by: Asim Mahmood <asim.seng@gmail.com>
Signed-off-by: Asim Mahmood <asim.seng@gmail.com> (cherry picked from commit dfef2c1) Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Signed-off-by: Asim Mahmood <asim.seng@gmail.com> (cherry picked from commit dfef2c1) Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Description
This is a built on top #19573.
Compare to date histogram skiplist, this change needs to hook into dynamic rounding during collection. There are 2 variables to keep track of:
bucketOrds- seen rounded datespreparedRounding- starts at lowest interval: MINUTES and goes upWhen a new ord is created,
increaseRoundingIfNeededfunction is called to determine if newpreparedRoundingneeds to kick in (e.g. from HOURS to DAYS), and may also merge dates inbucketOrds. Thus, both are need to be supplied via lambda.In the future skiplist can be enhanced to keep track of multiple
owningBucketOrd, for now it only works when auto date histogram is root (parent == null), or within range filter rewrite context that guarantees new auto date histogram is created per range.Related Issues
Resolves #19827
Part of #18882
Also #19384
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.
Summary by CodeRabbit
New Features
Improvements
Tests
✏️ Tip: You can customize this high-level summary in your review settings.